Abstract-The amount of energy
consumed by domestic appliances is 
an important area of research. 
Hence, the main goal of this study is 
to produce very precise forecasts 
about energy consumption by home 
appliances using the least amount of 
processing power. The algorithms 
used in this study for predicting 
energy usage/consumption included 
regression, K-nearest neighbor, 
decision trees, and random forest. 
These algorithms were applied on the 
appliances’ energy prediction 
dataset made available for public use 
at the UCI Machine Learning 
Repository. To compare the data sets 
and choose the optimal machine 
learning (ML) algorithm for them, 
root mean square error (RMSE) was 
computed. 


Index Terms- energy consumption, 
prediction energy utility, root mean 
square error (RMSE), supervised 
machine learning. 


I. Introduction


It is important to predict the 
amount of energy consumed by 
household electrical appliances, 
since improper appliance use wastes 
energy in the residential sector. 


Hence, an accurate assessment 
of energy demand in the housing 
sector is crucial in order to
determine the amount of energy that 
may be saved. The amount of 
energy saved mostly depends on the 
type of device being used; for 
example, some devices may cause 
an imbalanced state, others may
operate more slowly, and some have 
fixed running times. The exact 
energy forecasting model, from the 
perspective of energy providers, 
may assist in determining the ideal 
time to employ various devices to 
lower total carbon emissions and 
also to save money. By making 
good use of consumer business 
models, agents may arrange the 
functioning of various gadgets. The 
energy forecasting solutions also 
provide consumers with an in-depth 
analysis of home energy 
consumption patterns, allowing
them to better manage and control 
their energy use and energy 
expenses. Reactive power 
management has been the subject of 
extensive research from the 
viewpoint of an industry consumer, 
although little research has been 
conducted about it from the 
standpoint of a household 
consumer. The analysis of 
residential energy use and customer 
behavior provides valuable insights 
that may help to develop more 
efficient energy consumption 
tactics. It is a difficult optimization 
problem to plan the operations of 
home appliances in various smart 
homes, since it is fundamentally a 
complicated nonlinear 
combinatorial issue [1]. 


Models that predict the energy 
consumption of home appliances 
have been the subject of several
studies. The decision tree (DT)
method, the decision table classifier
(DTC), and the Bayesian network
(BN) are a few examples of
machine learning (ML) approaches
used to provide a model for
predicting the next-hour and next 
24-hour energy consumption of 
home appliances. These methods 
codify expert information on energy 
usage and provide a suitable data 
structure for the regressor. They 
also show how challenging it is to 
select the optimum regression
model for a particular dataset [2].


II. Literature Review


Load estimation and forecasting
are key factors when it comes to
distributing power efficiently and
keeping reserves for the future. In
order to meet the electricity demand 
and not to resort to load shedding, 
elements of load forecasting 
mechanisms should be integrated 
with the techniques utilized by the 
dominant power utility companies. 
There are many ML techniques that 
can be utilized to achieve the desired 
results. So, an in-depth analysis was
carried out using multiple ML 
algorithms in this paper to identify 
the best technique. The metrics
used to compare these algorithms
and to list their respective
advantages and disadvantages
included mean absolute error (MAE),
root mean square error (RMSE), and
mean absolute percentage error
(MAPE). The most efficient way
determined was not to use any one
of these algorithms independently
but to combine them in various
combinations and use these
combinations instead. This
approach tentatively yields the most
accurate results. Indeed, hybrid
algorithms are the ones that work 
the best with the understanding that 
all algorithms have their specific 
pros and cons. Load estimation
requires very frequent checks on
different load types. So, different 
technologies have been discussed 
with respect to different load 
horizons. As far as performance and 
accuracy are concerned, predictive 
models that are a combination of 
more than two existing models have 
been proven to be the most fruitful. 
Support vector machine (SVM), 
artificial neural network (ANN), 
and other relevant models have 
achieved a well-organized power 
system utility along with the 
minimum percentage of error. The 
authors of the current study 
conducted numerous tests to 
determine which techniques go well 
together in a hybrid model and yield 
the most accurate results [3]. 
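
The three error metrics named above (MAE, RMSE, and MAPE) can be
sketched in a few lines. This is only an illustrative sketch: the
function names and the load values are hypothetical and are not taken
from the cited study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error (undefined if any actual value is 0)."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical hourly loads (kW) and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing loads of
different sizes, whereas MAE and RMSE are expressed in the units of the
load itself.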


World population is increasing 
day by day and with it the overall 
electricity consumption is also 
increasing. In the current situation, 
where supply struggles to meet 
demand, the best course of action is 
to implement those techniques 
which may help to predict the 
overall electricity consumption at 
any given time in order to take
necessary measures beforehand to 
meet the demand. The energy 
consumption trends throughout are 
nonlinear and remain dynamic. To 
predict short-term and long-term 
consumption with high accuracy, it 
is imperative to use machine
learning along with distributed
demand response programs. In the
current paper, the techniques
discussed for this purpose include 
logistic regression (LR), support 
vector machine (SVM), naive Bayes 
(NB), decision tree classifier 
(DTC), K-nearest neighbor (KNN), 
and neural networks (NNs). The
point is to propose a model well
suited for short-term load
forecasting (STLF). After extensive
research and analysis, DTC was
identified as the best technique.
Enhanced DTC (EDTC) was
proposed, which utilizes an integrated
filter function, loss function, and
gradient boosting to fine-tune the
already existing DTC mathematical
model. The resulting EDTC
algorithm yields a better forecasting
result.


The forecasting and stabilizing 
of smart grid (SG) remains a 
challenge in today's landscape. ML
comes into play when we discuss the
resilience of SGs. Indeed, there
are different ML algorithms that
have the capability to predict
future energy needs. Amongst all
the suitable algorithms, selecting
and adopting the best one poses a 
challenge. Numerous tests were 
conducted to select the best model 
and DTC outperformed all the other 
algorithms. EDTC was established 
to further amplify the predicting 
capabilities of DTC. EDTC was
found to be superior to SVM, KNN, 
NN, LR, and DT, when it came to 
accuracy, precision loss, and ROC 
curve metrics [2]. 


It is crucial to predict and 
schedule the energy needs in smart 
buildings (SBs) and to meet them 
accordingly in order to deploy 
energy-efficient management 
systems. In the current paper, several
approaches were explored and
amongst them artificial neural
network (ANN) and genetic
algorithms remained in focus. To
get the most accurate result, ANN 
was implemented in a real SB 
testbed. The tests were conducted 
using photovoltaic panel installation
and SB electronic appliances and
the data was collected from them.
To implement ANN, the authors
used CompactRIO and their model
exhibited subpar prediction
accuracy [4].


In the proposed paper, the authors
built a model for accurate prediction
of energy consumption and also
worked on the scheduling process.
The proposed model utilized
machine learning. The research was
intended as a roadmap towards a
better model which can achieve
accuracy greater than that of this
model.


The proposed model was
implemented using Python in
LabVIEW, creating a new VI. In the
model, blocks containing any SB 
appliance could be selected any time 
of the day. The model functioned on 
the basis of ANN. The algorithm 
was not very accurate as the dataset 
fed to the model was not big enough 
for it to make proper computations. 
Hence, training and validation 
remained a challenge. It was 
determined that ANN is not the best 
model when it comes to prediction 
[4].

In this day and age, STLF is 
needed to fulfil the demand of 
power consumption. This helps to 
predict the pattern on which the
power system operates. The
methods of load forecasting
employed in non-residential areas 
are based on customer demand or on 
experience and historical data. The 
best way to make the best prediction 
is to use machine learning. In the 
current paper, the authors sought the 
best ML model to generate an 
accurate short-term algorithm for
non-residential areas. For this
purpose, the authors conducted
experiments and identified the best
model for industrial users.
Recurrent neural network based on 
sliding window approach turned out 
to be the best model for both short- 
term and long-term prediction. After 
three months of testing, the model 
that gave the best results was gated 
recurrent unit (GRU), with long
short-term memory (LSTM) as the
second best. Over these three
months, GRU saved 5326.17 euros
compared to LSTM and resulted in
5.28% MAPE. The proposed model
was made to justify and evaluate the
gap between evaluation metrics
and the impact of forecast errors in
the power market. The implementation
involved three-month data of 
different ML algorithms. GRU 
turned out to be the best and the 
authors considered the data as 
sufficient [5]. 
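
The sliding window approach mentioned above turns a load series into
(input window, target) pairs that a recurrent model such as GRU or LSTM
can be trained on. The sketch below shows only this windowing step. The
function name, the window length, and the load values are assumptions
made for illustration; the cited study's actual settings are not given
here.

```python
def sliding_windows(series, window, horizon=1):
    """Split a series into (input window, target) training pairs.

    Each input holds `window` consecutive values; the target is the
    value `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


# Hypothetical hourly load values, for illustration only.
load = [10, 12, 15, 14, 18, 21]
X, y = sliding_windows(load, window=3)

for features, target in zip(X, y):
    print(features, "->", target)
```

A longer window gives the model more history per sample but yields
fewer training pairs from the same series.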


Considering the overall increase 
in population and the depletion of 
energy resources around the world, 
we need to utilize our energy 
resources efficiently and develop a 
model to accurately predict the 
overall energy consumption
according to the given factors. To
figure out the best factors to utilize
in the prediction process, a univariate
regression algorithm was employed 
by the authors. The algorithm 
predicted that the factors with the 
most impact were overall height, 
roof area, surface, and relative 
compaction. The models tested for 
the given factors were DT, RF, and 
K-NN. The testing of these 
algorithms was conducted on
Orange software. After extensive
testing, the algorithm that gave the
most accurate prediction was RF.
The forecasting error was 1.128 and
0.404 for cooling and heating
loads, respectively. Research was
conducted on multiple ML
algorithms and again the algorithm 
that yielded the best result was RF. 
The error rate determined from the 
testing turned out to be 0.404 and 
1.128 for heating and cooling loads, 
respectively. Contributing factors 
were determined to make the best 
prediction using univariate 
algorithm. Height was determined 
as the most notable feature that 
contributed towards the prediction 
of the overall power consumption 
[6]. 

Home energy management 
systems (HEMS) can be further 
enhanced by the use of load 
forecasting. This can be achieved by 
utilizing machine learning
considering the increase in the 
relevant data in the recent years. The 
current authors propose two 
methods for load forecasting, 
enhancing the traditional long short- 
term memory (S2S-LSTM) model. 
In the first method, three algorithms 
are applied: density based spatial 
clustering of applications with noise 
(DBSCAN), K-means, and Pearson 
correlation coefficient (PCC).
Amongst all these techniques, PCC
proved itself to be the best one. PCC
was better at accommodating a large
number of consumers. The second
method constitutes an extension to
method one and increases its
performance. It utilizes an NN
architecture with softmax layers
which are fully-connected, dropout,
and stable. In LSTM, it further
optimizes supervised learning
which results in a more stable and 
accurate model for prediction. The 
findings were reached by
conducting an 8-week-long study
with 2337 consumers.
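
The Pearson correlation coefficient used above measures how closely two
consumption profiles move together, which is one way to judge whether
two consumers have similar habits. The sketch below is illustrative
only: the function name and the consumer profiles are hypothetical, and
the cited study's clustering procedure is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical load profiles of three consumers over four time slots.
consumer_a = [1.0, 2.0, 3.0, 4.0]
consumer_b = [2.0, 4.0, 6.0, 8.0]   # same shape as A, twice the level
consumer_c = [4.0, 3.0, 2.0, 1.0]   # opposite pattern to A

print(pearson(consumer_a, consumer_b))  # close to +1: similar habits
print(pearson(consumer_a, consumer_c))  # close to -1: opposite habits
```

Because the coefficient ignores the overall level of consumption, it
groups consumers by the shape of their usage pattern and not by how
much energy they use.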


In the current paper, two 
methods were proposed to make 
accurate predictions when it comes 
to energy consumption. These
methods enhance the S2S-LSTM model
to make the predictions. Method one
uses an amalgamation of human
pattern recognitions which is
extracted by three cluster analysis
algorithms. Amongst all the
algorithms that were employed PCC 
proved itself to be the best. The idea 
behind method one is to make a 
weight matrix by energy utilization 
habits and calculate the distance of 
the cluster. Three layer optimizing 
architecture was used to further 
enhance method one [7]. 


Electricity is a facility that is 
utilized daily by almost everyone 
and the lack of this energy will lead
to major disasters. We should generate
only the required amount of
electricity for utilization and cannot
make more than required because
utilizing large sums of energy is not
possible. The price that is associated
with the energy depends on the
sources of the energy, which in most
cases are hydro-electric power
plants, petroleum products, nuclear
and wind energy plants. Under and
overproduction are also causes
of the fluctuation in price, but the
ones that contribute the most
towards that fluctuation are
meteorological parameters, economics,
and industrial activities. That is why
load estimation needs to be performed
on a regional level, which will help
in efficient management, scheduling,
and planning. All of this would
result in overall low cost. In
machine learning there are multiple
algorithms that can be used to
accurately measure and estimate the
overall energy requirement. There
are different algorithms that were 
used in the proposed paper. The 
different supervised learning 
algorithms were linear regression 
(LR), support vector regressor
(SVR), K-nearest neighbor (KNN), 
random forest (RF), and AdaBoost. 
The performance that was 
associated with all these algorithms 
varied with different times and data. 
To minimize the price per unit,
machine learning is utilized with
correlated meteorological parameters
kept in consideration. The model
that was the focus of the proposed
paper was the least cost electric load
forecasting model (LCELFM). The
model was implemented by
minimizing root mean square error
(RMSE), mean absolute error
(MAE), and mean absolute
percentage error (MAPE). The data
for testing was taken from
Muzaffarabad from the start of January
2014 till the end of December 2015.
The Pakistan Meteorological Department
provided the authors with
meteorological time series data for
the same period. The
proposed model turned out to be the
best when compared to other
models.

Human energy consumption
patterns change consistently with
changes in habits that develop due to
the change in weather. This energy
is generated using sources like
water, petroleum, wind, and natural
energy. Consumers want the price to
be less whereas the providers want
the profits to be maximum [8].


AELF model was developed
taking into account the weather
conditions and human behaviors.
This model is also used to reduce the
price of electricity. The proposed
paper suggests a least cost
estimation model and utilizes the
meteorological parameter-driven
electrical load demand of
Muzaffarabad. The study suggested
that meteorological parameters like 
temperature influence the overall 
consumption. This proved that 
factors like time and season 
drastically impact the consumption. 


The proposed model generated 
forecasting reduced prediction error 
for ELnMPFModels. There was a 
significant reduction in MAPE and 
with the implementation of the 
proposed model Muzaffarabad will 
save 0.303 million rupees daily. 
Although this study was conducted
keeping Muzaffarabad in mind,
the same model can be implemented
in any city of Pakistan to get
accurate load estimates [4].


Estimation of power
consumption is a very important
task when energy has to be
generated in advance or the
generation of power planned
beforehand. With the
implementation of the smart grid, the
need for energy consumption
estimation is dire. Estimation of a
future event is always a difficult task
and to do it with high precision is
an even bigger challenge. There
have been many attempts at
estimating power consumption
accurately but none of them was very
accurate. Machine learning has
multiple algorithms that are well
suited for this task. Machine
learning has been recognized to
predict failure before it even occurs.
Machine learning is a form of artificial
intelligence (AI) which develops a
model by studying given data.
The algorithms that were explored 
were artificial neural network 
(ANN), multiple linear regression
(MLR), adaptive neuro-fuzzy
inference system (ANFIS), and
support vector machine (SVM). The
region selected in the proposed paper
for power generation is Cyprus.
Testing was done on real data
accumulated over 2016 and 2017 for
use in long-term
and short-term analysis. It was
determined that the factors that 
affect the consumption of electricity 
the most are temperature, humidity, 
solar irradiation, population, gross 
national income (GNI) per capita, 
and the electricity price per 
kilowatt-hour. After multiple
computations, it was found
that SVM and ANN were superior to
the other machine learning algorithms,
with lower estimation errors
[9].


With smart grid implementation,
load estimation is more important
than ever. The prediction of load at
any given time is difficult
considering the dynamic nature of
the consumer. For prior planning of
energy consumption, it is crucial
to estimate it beforehand. In this
study, multiple methods were tested
and ANN and SVM were more
accurate and provided better
estimation of the energy required
[10].

III. Research Methods


ML algorithms, on which the
data set was implemented and
executed, included decision trees
(DT), support vector machine
(SVM), logistic regression, linear
regression, K-nearest neighbor
(KNN), and random forest (RF).
SVM is a type of generalized linear
classifier that uses supervised
learning to classify data into binary
categories. SVM was first
introduced in 1964 and it
grew in popularity during the 1990s,
resulting in a number of enhanced
and expanded algorithms.
Regression problems may also be
solved with SVM. KNN may be 
used to study regression by getting a 
sample's nearest neighbors and 
assigning the average of these 
neighbors’ properties to the sample 
in order to obtain the sample's 
properties. Another (improved) way 
is to assign different weights to the 
effects of neighbors at various 
distances from the sample, with the 
weight being inversely proportional 
to the distance. Among ML
techniques, it was found that RF is
faster in the training process and
more effective in
handling high-dimensional data and
complex problems in the industry.
Its performance remains stable and
accurate because it creates
multiple decision trees and
combines them to produce output.
Figure 1 summarizes the proposed 
methodology. 
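
The two KNN regression variants described above (plain averaging of the
nearest neighbors, and weighting each neighbor inversely to its
distance) can be sketched as follows. This is a minimal sketch assuming
Euclidean distance; the function name `knn_predict` and the training
values are hypothetical and do not come from the study's implementation.

```python
import math


def knn_predict(train_x, train_y, query, k=3, weighted=True):
    """Predict a value for `query` from its k nearest training samples."""
    # Sort (distance, target) pairs and keep the k closest neighbors.
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]

    if not weighted:
        # Plain KNN regression: average of the neighbors' targets.
        return sum(y for _, y in neighbors) / len(neighbors)

    # A neighbor at distance 0 would get infinite weight, so exact
    # matches are averaged on their own.
    exact = [y for d, y in neighbors if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    # Weighted variant: weight is inversely proportional to distance.
    weights = [1.0 / d for d, _ in neighbors]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


# Hypothetical one-feature samples (e.g. a temperature reading) with
# hypothetical energy targets, for illustration only.
train_x = [(0.0,), (1.0,), (3.0,)]
train_y = [10.0, 20.0, 40.0]

plain = knn_predict(train_x, train_y, (1.5,), k=3, weighted=False)
weighted = knn_predict(train_x, train_y, (1.5,), k=3, weighted=True)
print(plain, weighted)
```

In this example the weighted prediction is pulled toward the target of
the closest sample, whereas the plain average treats all three
neighbors equally.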
IV. Experimental Setups and Results


The experimental setup included
the tools needed to identify the best
ML algorithm for making predictions
and doing anticipatory tasks.


A. Data Set 

Data set consisted of
temperature (measured in Celsius)
and humidity (measured in
percentage). The data was collected
by installing a temperature sensor and
a humidity sensor in all the rooms of
a low energy house. The values on
the temperature and humidity
sensors were monitored with the help
of a ZigBee wireless sensor network.
Each sensor transmitted the data
(condition of temperature and
humidity) about every 3.3 min, and
these readings were averaged over
10 min periods. The
data was thus logged every 10
min. Details
about the outside weather (pressure,
humidity, wind speed, and visibility)
were taken from the weather station.
This data was linked with the
experimental data of temperature
and humidity of each room with the
help of date and time columns. The
source of the appliances energy
prediction data set is the UCI Machine
Learning Repository. Table 1
summarizes the characteristics of
the dataset.

Fig. 1. Proposed method.

From a total of 19,735 data
samples, 15,788 samples (80%)
were randomly assigned to the
training set and 3,947 samples
(20%) to the testing set. Table 2
gives the data set description.

Table I
Characteristics of Dataset

Characteristics        Values
No. of data samples    19,735
No. of features        29
Sampling time/rate     10 min

B. Results

All prediction models are
regression models. Root mean
square error (RMSE), mean
absolute error (MAE), mean square
error (MSE), and the coefficient of
determination (R2) are all regularly
used metrics for assessing
regression models. In this
experiment, the RMSE assessment
indicator was employed. It was used
to calculate the difference between
the predicted value and the true value.

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - Y_i')^2} \]

Here, Yi is the real value of the
data at time i, Yi' is the predicted
value of the data at time i, and n is
the number of samples.

Table 3 shows the performance
evaluation of the prediction models
used to measure the energy
consumption of household
appliances. The RMSE of linear
regression is 0.046 (which is very
low). It shows the better
performance of this model as
compared to other models. RF has
an RMSE of 0.047 and it also shows
better performance than SVM and
logistic regression.

Table III
Model Performance

Models                 Accuracy    RMSE
Linear Regression      0.953       0.046
Random Forest          0.939       0.047
Logistic Regression    0.722       [illegible]
SVM                    0.740       10.37
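
The random 80/20 split reported in this section can be sketched as
below. The function name, the fixed seed, and the use of integer
placeholders for the 19,735 samples are assumptions made for
illustration; the study does not state how its split was implemented.

```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into a training set and a testing set."""
    indices = list(range(len(samples)))
    # A fixed seed makes the (hypothetical) split reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(len(samples) * train_fraction)
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# Placeholder rows standing in for the 19,735 samples of the data set.
samples = list(range(19735))
train, test = train_test_split(samples)

print(len(train), len(test))
```

With 19,735 samples and an 80% training fraction, this yields the same
set sizes as those reported above (15,788 and 3,947).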

Table II
Dataset Variables and Description

Feature #   Variable   Description
1           T1         Temperature in kitchen
2           T2         Temperature in living room
3           T3         Temperature in laundry room
4           T4         Temperature in office room
5           T5         Temperature in bathroom
6           T6         Temperature outside the building
7           T7         Temperature in ironing room
8           T8         Temperature in teenager room
9           T9         Temperature in parents' room
10          R1         Humidity in kitchen
11          R2         Humidity in living room
12          R3         Humidity in laundry room
13          R4         Humidity in office room
14          R5         Humidity in bathroom
15          R6         Humidity outside the building
16          R7         Humidity in ironing room
17          R8         Humidity in teenager room
18          R9         Humidity in parents' room
19          L          Light energy consumption
20          RO         Humidity outside the airport
21          Td         Dew point temperature
22          V          Visibility
23          W          Windspeed
24          TO         Temperature outside the airport
25          rv1        Random variable 1
26          rv2        Random variable 2
27          P          Pressure (mm Hg)

Fig. 2. Accuracy of models.
Fig. 3. Root mean square error of models.
Fig. 5. Energy consumption of lights.


ICR Innovative Computing Review
Volume 2 Issue 1, Spring 2022
Gulnar


Fig. 6. Pressure and humidity during energy consumption.
Fig. 7. Visibility (scatter plot).

V. Conclusion


Prediction models for energy
usage by home appliances based on
SVM, KNN, RF, linear regression,
and logistic regression were
investigated. Firstly, the authors
preprocessed the data to
remove certain features from the
filtered data and to normalize it.
Secondly, the grid search technique
was utilized to find the best
parameters for the model and
models based on several ML
algorithms were created. Finally,
each model's prediction
performance was tested and 
compared. The results revealed that 
linear regression and RF obtained 
good results in both the training and 
testing data sets, with the best 
prediction performance among the 
four prediction models developed 
using the classic ML approach. In 
the testing set, KNN, RF, and SVM
all performed equally well;
however, SVM performed the
poorest in the training data set.
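
The normalization and grid search steps summarized above can be
sketched as follows. This is only a sketch under stated assumptions:
min-max scaling is assumed as the normalization method, the parameter
names `n_trees` and `max_depth` are hypothetical, and the scoring
function is a toy stand-in for "train a model with these parameters and
return its validation RMSE". None of this is taken from the study's own
code.

```python
import itertools
import math


def min_max_normalize(column):
    """Rescale a feature column to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


def grid_search(param_grid, score_fn):
    """Try every parameter combination and keep the lowest score."""
    names = sorted(param_grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_rmse(n_trees, max_depth):
    """Hypothetical error surface used in place of real model training."""
    return abs(n_trees - 100) / 100 + abs(max_depth - 8) / 8


normalized = min_max_normalize([10, 20, 30])
best_params, best_score = grid_search(
    {"n_trees": [50, 100, 200], "max_depth": [4, 8, 16]},
    toy_validation_rmse,
)
print(normalized, best_params, best_score)
```

Grid search is exhaustive, so its cost grows with the product of the
grid sizes; a coarse grid is a common starting point when processing
power is limited.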